Abstract. Astronomers are good at sharing data, but poorer at sharing knowledge.

Almost all astronomical data ends up in open archives, and access to these is being 
simplified by the development of the global Virtual Observatory (VO). This is a great 
advance, but the fundamental problem remains that these archives contain only basic 
observational data, whereas all the astrophysical interpretation of that data - which 
source is a quasar, which a low-mass star, and which an image artefact - is contained 
in journal papers, with very little linkage back from the literature to the original data 
archives. It is therefore currently impossible for an astronomer to pose a query like 
"give me all sources in this data archive that have been identified as quasars" and this 
limits the effective exploitation of these archives, as the user of an archive has no direct 
means of taking advantage of the knowledge derived by its previous users. 

The AstroDAbis service aims to address this. It is a prototype service that enables
astronomers to record annotations and cross-identifications of objects in other
catalogues. We have deployed two interfaces to the annotations: an astronomy-specific
one using the TAP protocol (Dowler et al. 2010), and a second exploiting generic
Linked Open Data (LOD) and RDF techniques.



1. Introduction 

The AstroDAbis service provides a stand-off annotation service for astronomical
catalogue entries. Catalogues appear in many forms, and at scales ranging from tables
in journal articles (later made available electronically by the journals, in some cases) 
to large software-engineering efforts on the part of specialised archives. At all scales, 
however, there are three key problems when working with archives. 

1. Catalogues contain information, but the knowledge derived from analysis of
them resides elsewhere, typically only in journal articles. This separation is intentional
and well-motivated - a catalogue contains only values for readily measurable
quantities, and is, therefore, viewed as objective, while an astronomer's judgement comes
into play when those measurements are interpreted astrophysically - but the
relationship between them is, typically, asymmetric: through hyperlinks provided by ADS
(http://adsabs.harvard.edu/) a journal article can point to the online catalogue(s)
used in its analysis, but the data archives hosting the catalogue(s) do not implement
analogous pointers to the additional information about catalogue entries that is present
in the online literature.

2. Objects in the sky do not come with unique labels. Combining information from 
different catalogues requires cross-matching. This is expensive, and up to now the only 
way of speeding it up, by either preserving the results of cross-matches, or providing 
facilitating information on neighbour distances, has been to publish (effectively) a new 
catalogue which, because it must be in an archive-specific format and location, is hard 
to reuse widely. 

3. Catalogues are static objects. Any such fresh catalogue, derived from one 
or more existing catalogues but with the addition of deduced or fresh information, is 
logically independent from its progenitors. Although there is a chain of provenance, of 
course, the relationship of the new information to the old is not available to the machine. 

AstroDAbis addresses these three problems. 

1. It provides a tagging interface which allows users to associate annotations with
catalogue objects. This 'folksonomy' tagging is inevitably imprecise, but (a) this
imprecision may be acceptable in some circumstances, and (b) a very closely
analogous mechanism will allow users to associate more semantically sophisticated
annotations with objects, when consensus emerges on what such annotations
should look like.

2. As a fundamental part of its design, the AstroDAbis service will implicitly create
URI names for every object in every catalogue it knows about. This may seem
profligate, or even impertinent, but (a) the service will be able to declare
equivalences to any URI names that a catalogue already supports, and (b) as well as
supporting cross-match tables, this creates the 'raw materials' for other
experiments deploying the Semantic Web within astronomy. Although it is not part
of TAP at present, one could imagine an extension to TAP which documented a
service's preferred pattern for URIs naming the objects it contains.

3. Stand-off tagging enables astronomer users to annotate catalogues, and objects
in catalogues, to which they have no write access. This creates the possibility
of Web 2.0 or Semantic Web infrastructures without requiring catalogues to
make the potentially disruptive changes to their systems which built-in
annotation would demand. One implication of the Linked Data interface to the service
(see http://linkeddata.org) is that we provide RDF information about both
the catalogue objects, and the celestial objects they refer to, in a flexible and
open-ended way.

The use of annotations to enrich existing data resources is not new to science
or to the Web world; indeed sites such as delicious.com or Flickr are primarily
concerned with such annotation in the form of 'tagging', and the associated notion of
'folksonomy'. Delicious-style tagging is one of the inspirations of the AstroDAbis
project, but the more immediate one is the Distributed Annotation System (DAS,
http://www.biodas.org), which is a widely-used protocol for exchanging annotations on
genomic and protein sequences. This system inspired previous work involving one of
us (Mann), on the development of the AstroDAS system (Bose et al. 2006), which
prototyped the recording and publication of annotations of astronomical catalogues.
AstroDAS was a successful proof-of-concept, but the immaturity of the VO protocol suite
at that time meant that it could not be implemented using open standards developed by
the IVOA, and this limited its utility. Since then, the IVOA has released TAP, which
provides a standard means of accessing tabular astronomical datasets, and important
catalogues are now being published to the VO through TAP services.

At the time of writing the service is available as a prototype, but as it matures 
(during 2012) we plan to bring it up to a supported service, hosted by the Wide Field 
Astronomy Unit at Edinburgh. 



2. Using the service 



2.1. TAP Factory and OGSA-DAI 



Although it is not a dependency, the AstroDAbis system was designed to be naturally
usable with a TAP Factory (Hume et al. 2011) based on OGSA-DAI (a framework for
distributed data and query management; see http://www.ogsadai.org.uk/). Using
the TAP factory a service provider can create a service, with a TAP interface, which
allows a user to make a TAP query which refers to multiple other TAP services. The
OGSA-DAI service then decomposes the query into a group of single-service queries,
and re-combines the result streams into a single result set which it passes back to the
end-user. By this means, and as illustrated in Table 1, an astronomer user can easily
create a query which uses observational information from one catalogue along with
annotation, identification or neighbour information from the AstroDAbis service.

Figure 1. The OGSA-DAI architecture and AstroDAbis



By providing a simple annotation service, the AstroDAbis mechanism has the
potential to support annotation of a very broad range of astronomical objects, in a very
broad range of repositories.
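
The decompose-and-recombine behaviour described in this section can be illustrated with a toy sketch. This is not OGSA-DAI's implementation: the two in-memory "services", their column names, and all values are invented for illustration, and the join strategy (a simple hash join on a shared key) is an assumption chosen for brevity.

```python
# Toy sketch of a federated join: two independent "services" are queried
# separately and their result streams are recombined on a shared key.
# All table contents and column names below are hypothetical.

def query_catalogue():
    # Stand-in for a single-service query against an observational catalogue.
    return [
        {"objID": 101, "ra": 10.684, "dec": 41.269},
        {"objID": 102, "ra": 83.822, "dec": -5.391},
        {"objID": 103, "ra": 201.365, "dec": -43.019},
    ]


def query_annotations():
    # Stand-in for a single-service query against an annotation service.
    return [
        {"objID": 101, "tag": "quasar"},
        {"objID": 103, "tag": "artefact"},
    ]


def recombine(left, right, key):
    # Hash join: index one result stream by key, then stream the other past it.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    for row in left:
        for match in index.get(row[key], []):
            yield {**row, **match}


results = list(recombine(query_catalogue(), query_annotations(), "objID"))
for row in results:
    print(row)
```

Only catalogue rows that have an annotation survive the join, which mirrors the effect of an inner join between a catalogue table and an annotation table in a single multi-service query.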

2.2. Adding and retrieving annotations 

The service supports a web-based interface, which allows a user to enter templated
queries (which expand to ADQL queries; Ortiz et al. 2008) and to tag the objects which
result. Alternatively (and more suitably for batch-mode or bulk annotation), users can
upload annotations contained in a VOTable, as illustrated in Table 1.
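
For bulk annotation, a VOTable can also be assembled programmatically. The following is a minimal sketch, assuming the column names pts_key, objID and tagvalue used in Table 1; whether the service requires exactly these fields is not confirmed here, the row values are invented, and the VOTable namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_votable(rows):
    # rows: iterable of (pts_key, objID, tagvalue) tuples (hypothetical values).
    votable = ET.Element("VOTABLE", version="1.2")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE")
    # Field names and UCDs follow Table 1; treat them as assumptions.
    fields = [
        ("pts_key", "long", "meta.id;meta.main"),
        ("objID", "long", "meta.id;meta.dataset"),
        ("tagvalue", "float", "pos.angDistance"),
    ]
    for name, datatype, ucd in fields:
        ET.SubElement(table, "FIELD", name=name, datatype=datatype, ucd=ucd)
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


xml_text = build_votable([(1001, 5877, 0.012), (1002, 5878, 0.034)])
print(xml_text)
```

Each TR element carries one annotation row, so a cross-match of any size maps directly onto the table body.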

Since the AstroDAbis service exposes a TAP interface to the world, its annotation 
information is available through ADQL interfaces similar to the one illustrated. 
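
As a rough sketch of what querying the annotations over TAP might look like, the snippet below builds a synchronous TAP request URL using the standard REQUEST, LANG and QUERY parameters. The base URL, the table name `annotations`, and its columns are hypothetical placeholders, not the service's actual endpoint or schema.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; substitute the real service URL.
TAP_BASE = "http://example.org/astrodabis/tap"


def tap_sync_url(base, adql):
    # A TAP synchronous query is a request to the /sync resource carrying
    # REQUEST=doQuery, LANG=ADQL and the query text itself.
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return base.rstrip("/") + "/sync?" + urlencode(params)


# Table and column names are assumptions made for illustration only.
adql = "SELECT TOP 10 pts_key, tagvalue FROM annotations WHERE tagvalue = 'quasar'"
url = tap_sync_url(TAP_BASE, adql)
print(url)
```

The resulting URL can be fetched with any HTTP client; the response would normally be a VOTable.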

The TAP interface makes AstroDAbis a first-class citizen in the VO, so that its
users' annotations can be combined with information from other VO services to support
high-level queries such as, for example, "find me the redshifts of all the objects which
Fred Bloggs identifies as quasars".

SELECT TOP 100 masterObjID as pts_key,
       slaveObjID as objID, distanceMins as tagvalue
FROM twomass_pscXBestDR7PhotoObjAll

<FIELD name='pts_key' ID='masterObjID'
       ucd='meta.id;meta.main' datatype='long'>
  <DESCRIPTION>The unique ID in twomass_psc</DESCRIPTION>
</FIELD>
<FIELD name='objID' ID='slaveObjID'
       ucd='meta.id;meta.dataset' datatype='long'>
  <DESCRIPTION>The unique ID of the neighbour
    in BestDR7..PhotoObjAll (=objID)</DESCRIPTION>
</FIELD>
<FIELD name='tagvalue' ID='distanceMins'
       ucd='pos.angDistance' datatype='float' unit='arcminutes'>
  <DESCRIPTION>Angular sep. between neighbours</DESCRIPTION>
</FIELD>

Table 1. An ADQL query which creates a VOTable, which can subsequently be
uploaded to the AstroDAbis service to create a two-object annotation.
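
The high-level query quoted above ("find me the redshifts of all the objects which Fred Bloggs identifies as quasars") amounts to a join between a catalogue table and an annotation table. The sketch below runs such a join locally in SQLite rather than over TAP; the table layouts, column names and values are all invented for illustration and do not reflect the service's schema.

```python
import sqlite3

# In-memory stand-ins for a catalogue and an annotation table (hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE catalogue (objID INTEGER PRIMARY KEY, redshift REAL);
CREATE TABLE annotations (objID INTEGER, annotator TEXT, tag TEXT);
INSERT INTO catalogue VALUES (1, 0.158), (2, 2.31), (3, 0.0);
INSERT INTO annotations VALUES
    (1, 'Fred Bloggs', 'quasar'),
    (2, 'Fred Bloggs', 'quasar'),
    (3, 'Fred Bloggs', 'star'),
    (2, 'A. N. Other', 'artefact');
""")

# Redshifts of every object that one named annotator has tagged as a quasar.
rows = conn.execute(
    """
    SELECT c.objID, c.redshift
    FROM catalogue AS c
    JOIN annotations AS a ON a.objID = c.objID
    WHERE a.annotator = ? AND a.tag = ?
    ORDER BY c.objID
    """,
    ("Fred Bloggs", "quasar"),
).fetchall()
print(rows)
```

Because annotations are keyed by annotator as well as by object, conflicting opinions about the same object can coexist and be filtered at query time.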

As well as the TAP-based interface, AstroDAbis has a 'Linked Data' interface. 
Although this provides utility by itself, it additionally provides a mechanism for
creating URI-based names for the objects in the catalogues it annotates. These can act as a
springboard for future experiments with the Semantic Web in astronomy. 
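
To make the idea of URI-based names concrete, here is a minimal sketch that mints a URI for a catalogue object and emits two statements about it in N-Triples syntax. The base URL, the URI pattern, the `hasTag` property and the archive URI are hypothetical; the use of `owl:sameAs` to declare equivalence with a catalogue's own URI is likewise an assumption, not a documented feature of the service. Literal escaping is simplified to backslashes and double quotes.

```python
from urllib.parse import quote

BASE = "http://example.org/astrodabis"  # hypothetical service base URL
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
HAS_TAG = BASE + "/vocab#hasTag"  # hypothetical annotation property


def object_uri(catalogue, obj_id):
    # Hypothetical URI pattern: one name per object per catalogue.
    return "{}/catalogue/{}/object/{}".format(
        BASE, quote(catalogue, safe=""), quote(str(obj_id), safe="")
    )


def uri_triple(subject, predicate, obj):
    # N-Triples statement whose object is another resource.
    return "<{}> <{}> <{}> .".format(subject, predicate, obj)


def literal_triple(subject, predicate, text):
    # N-Triples statement whose object is a plain string literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '<{}> <{}> "{}" .'.format(subject, predicate, escaped)


subject = object_uri("twomass_psc", 1001)
lines = [
    literal_triple(subject, HAS_TAG, "quasar"),
    uri_triple(subject, OWL_SAME_AS, "http://example.org/archive/twomass_psc/1001"),
]
print("\n".join(lines))
```

Keeping the URI pattern in one function makes it straightforward to change the naming scheme later without touching the code that emits statements.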

2.3. Further information 

See http://code.google.com/p/astrodabis/ for project source code and documentation.